(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property 
Organization 
International Bureau 

(43) International Publication Date 
11 November 2004 (11.11.2004) 







(10) International Publication Number

PCT WO 2004/097052 A2 



(51) International Patent Classification 7: C12Q 1/68

(21) International Application Number:

PCT/US2004/013587 

(22) International Filing Date: 29 April 2004 (29.04.2004)

(25) Filing Language: English 

(26) Publication Language: English 



(30) Priority Data:
60/466,067    29 April 2003 (29.04.2003)      US
60/538,246    23 January 2004 (23.01.2004)    US

(71) Applicant (for all designated States except US): WYETH
[US/US]; 5 Giralda Farms, Madison, NJ 07940 (US).

(71) Applicant and

(72) Inventor: STRAHS, Andrew [US/US]; 30 McKinley
Street, Maynard, MA 01754 (US).

(72) Inventor: TREPICCHIO, William, L.; 21 Abbott Ridge
Drive, Andover, MA 01810 (US).

(72) Inventors; and

(75) Inventors/Applicants (for US only): BURCZYNSKI,
Michael, E. [US/US]; 71 Franklin Avenue, Swampscott,
MA 01907 (US). TWINE, Natalie, C. [US/US]; 379
Shirley Hill Road, Goffstown, NH 03045 (US). SLONIM,
Donna, K. [US/US]; 799 Dale Street, North Andover,
MA 01845 (US). EVIMERMAN, Fred [US/US]; 194
Haverstraw Road, Suffern, NY 10901 (US). DORNER,
Andrew, J. [US/US]; 20 Baskin Road, Lexington, MA
02421 (US).

(74) Agent: VAN DYKE, Raymond; Nixon Peabody LLP, 401
9th Street, N.W., Washington, DC 20004 (US).

(81) Designated States (unless otherwise indicated, for every
kind of national protection available): AE, AG, AL, AM,
AT, AU, AZ, BA, BB, BG, BR, BW, BY, BZ, CA, CH, CN,
CO, CR, CU, CZ, DE, DK, DM, DZ, EC, EE, EG, ES, FI,
GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE,
KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD,
MG, MK, MN, MW, MX, MZ, NA, NI, NO, NZ, OM, PG,
PH, PL, PT, RO, RU, SC, SD, SE, SG, SK, SL, SY, TJ, TM,
TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, YU, ZA, ZM,
ZW.



(84) Designated States (unless otherwise indicated, for every 
kind of regional protection available): ARIPO (BW, GH, 
GM, KE, LS, MW, MZ, NA, SD, SL, SZ, TZ, UG, ZM, 
ZW), Eurasian (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), 
European (AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, 
FR, GB, GR, HU, IE, IT, LU, MC, NL, PL, PT, RO, SE, SI, 
SK, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, 
GW, ML, MR, NE, SN, TD, TG). 

Published: 

— without international search report and to be republished 
upon receipt of that report 

— with sequence listing part of description published sepa- 
rately in electronic form and available upon request from 
the International Bureau 

For two-letter codes and other abbreviations, refer to the "Guid- 
ance Notes on Codes and Abbreviations" appearing at the begin- 
ning of each regular issue of the PCT Gazette. 






(54) Title: METHODS FOR PROGNOSIS AND TREATMENT OF SOLID TUMORS 



(57) Abstract: Solid tumor prognosis genes, and methods, systems and equipment of using these genes for the prognosis and treatment
of solid tumors. Prognosis genes for a solid tumor can be identified by the present invention. The expression profiles of these
genes in peripheral blood mononuclear cells (PBMCs) are correlated with clinical outcome of the solid tumor. The prognosis genes
of the present invention can be used as surrogate markers for predicting clinical outcome of a solid tumor in a patient of interest.
These genes can also be used to select a treatment which has a favorable prognosis for the solid tumor of the patient of interest.






METHODS FOR PROGNOSIS AND TREATMENT OF SOLID TUMORS

[0001] The present invention incorporates by reference all materials recorded in the
compact discs labeled "Copy 1 - Sequence Listing Part," "Copy 2 - Sequence Listing Part,"
and "Copy 3 - Sequence Listing Part," each of which includes "Sequence Listing.ST25.txt"
(5,454 KB, created April 28, 2004). The present invention also incorporates by reference
all materials recorded in the compact discs labeled "Copy 1 - Tables Part," "Copy 2 -
Tables Part," and "Copy 3 - Tables Part," each of which includes the following files: "Table
3 - Spearman Correlation of Baseline Expression with Clinical Outcome.txt" (298 KB,
created April 28, 2004), "Table 4 - Qualifiers and the Corresponding Entrez and Unigene
Accession Nos.txt" (179 KB, created April 28, 2004), "Table 5 - Genes and Gene Titles.txt"
(331 KB, created April 28, 2004), and "Table 8 - Cox Regression of Clinical Outcome on
Baseline Gene Expression.txt" (294 KB, created April 28, 2004).

CROSS-REFERENCE TO RELATED APPLICATIONS 

[0002] The present application claims priority from and incorporates by reference the 
entire disclosures of U.S. Provisional Patent Application Serial No. 60/466,067, filed April 
29, 2003, and U.S. Provisional Patent Application Serial No. 60/538,246, filed January 23, 
2004. 

TECHNICAL FIELD 

[0003] The present invention relates to solid tumor prognosis genes and methods of 
using these genes for the prognosis or treatment of solid tumors. 

BACKGROUND 

[0004] Expression profiling studies in primary tissues have demonstrated that there 
exist transcriptional differences between normal and malignant tissues. See, for example, 
Su, et al., Cancer Res, 61:7388-7393 (2001); and Ramaswamy, et al., Proc Natl Acad
Sci U.S.A., 98:15149-15151 (2001). Recent clinical analyses have also identified
expression profiles within tumors that appear to be highly correlated with certain measures
of clinical outcomes. One study has demonstrated that expression profiling of primary
tumor biopsies yields prognostic "signatures" that rival or may even out-perform currently
accepted standard measures of risk in cancer patients. See van de Vijver, et al., N Engl J
Med, 347:1999-2009 (2002).

SUMMARY OF THE INVENTION 

[0005] The present invention provides methods, systems and equipment for prognosis 
or selection of treatment of solid tumors. Prognosis genes for a solid tumor can be 
identified by the present invention. The expression profiles of these genes in peripheral 
blood mononuclear cells (PBMCs) are correlated with clinical outcome of the solid tumor. 
These genes can be used as surrogate markers for predicting clinical outcome of the solid 
tumor in a patient of interest. These genes can also be used to identify or select treatments 
which have favorable prognoses for the patient of interest.

[0006] In one aspect, the present invention provides methods that are useful for the 
prognosis or selection of treatment of a solid tumor in a patient of interest. The methods 
include comparing an expression profile of one or more prognosis genes in a peripheral 
blood sample of the patient of interest to at least one reference expression profile of the 
prognosis genes. Each of the prognosis genes is differentially expressed in PBMCs of a 
first class of patients as compared to PBMCs of a second class of patients. Both classes of 
patients have a solid tumor, and each class of patients has a different clinical outcome. In 
many embodiments, the prognosis genes are substantially correlated with a class distinction 
between the two classes of patients. 

[0007] Solid tumors amenable to the present invention include, but are not limited to, 
renal cell carcinoma (RCC), prostate cancer, head/neck cancer, and other tumors that do not 
have their origin in blood or lymph cells. 

[0008] Clinical outcome can be measured by any clinical indicator. In one 
embodiment, clinical outcome is determined based on clinical classifications such as 
complete response, partial response, minor response, stable disease, progressive disease, 
non-progressive disease, or any combination thereof. In another embodiment, clinical 
outcome is measured by time to disease progression (TTP) or time to death (TTD). In still 
another embodiment, clinical outcome is prognosticated by using traditional risk assessment 
methods, such as Motzer risk classification for RCC. Other patient responses to a 
therapeutic treatment can also be used to measure clinical outcome. Examples of solid 
tumor treatments include, but are not limited to, drug therapy (e.g., CCI-779 therapy), 
chemotherapy, hormone therapy, radiotherapy, immunotherapy, surgery, gene therapy,
anti-angiogenesis therapy, palliative therapy, or any combination thereof.
[0009] In many embodiments, the reference expression profile(s) includes an average 
expression profile of the prognosis genes in peripheral blood samples of reference patients. 
In many instances, the reference patients have the same solid tumor as the patient of 
interest, and the clinical outcome of the reference patients is either known or determinable.
[0010] The peripheral blood samples of the patient of interest and reference patients
can be whole blood samples, or blood samples comprising enriched or purified PBMCs. 
Other types of blood samples can also be employed in the present invention. In one 
embodiment, all of the peripheral blood samples are baseline samples which are isolated 
from respective patients prior to a therapeutic treatment of the patients. 
[0011] Any comparison method can be used to compare the expression profile of the
patient of interest to the reference expression profile(s). In one embodiment, the 
comparison is based on the absolute or relative peripheral blood expression level of each 
prognosis gene. In another embodiment, the comparison is based on the ratios between 
expression levels of two or more prognosis genes. In yet another embodiment, the reference 
expression profiles include at least two distinct expression profiles, each being derived from 
a different class of reference patients. The comparison of the expression profile of the 
patient of interest to the reference expression profiles can be carried out by using methods 
including, but not limited to, hierarchical clustering, k-nearest-neighbors, or weighted-voting
algorithm.
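For purposes of illustration only, a k-nearest-neighbors comparison of this kind might be sketched as follows. This is a minimal sketch and not the implementation used in the studies described herein: the Euclidean distance measure, the value k=3, the function names, and all expression values are assumptions introduced for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(profile, references, k=3):
    """Assign the majority outcome class among the k closest reference profiles.

    references: list of (expression_profile, outcome_class) pairs.
    """
    nearest = sorted(references, key=lambda ref: euclidean(profile, ref[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical normalized expression levels for three prognosis genes.
reference_patients = [
    ([2.1, 0.4, 1.8], "favorable"),
    ([1.9, 0.5, 2.0], "favorable"),
    ([2.3, 0.3, 1.7], "favorable"),
    ([0.6, 1.9, 0.5], "poor"),
    ([0.8, 2.2, 0.4], "poor"),
    ([0.5, 2.0, 0.7], "poor"),
]
patient_of_interest = [2.0, 0.6, 1.6]
predicted_class = knn_classify(patient_of_interest, reference_patients, k=3)
```

In this sketch each reference patient contributes one vote, so an odd k avoids ties between two outcome classes.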

[0012] In still another embodiment, the methods of the present invention include
selecting a treatment which has a favorable prognosis for the solid tumor in the patient of 
interest. 

[0013] In another aspect, the present invention provides other methods useful for the 
prognosis or selection of treatment of a solid tumor in a patient of interest. These methods 
include comparing an expression profile of one or more prognosis genes in a peripheral 
blood sample of the patient of interest to at least one reference expression profile of the 
prognosis genes, where each of the prognosis genes is differentially expressed in PBMCs of 
a first class of patients as compared to PBMCs of a second class of patients. Each of the 
first and second classes is a subcluster formed by an unsupervised clustering analysis of 
gene expression profiles in PBMCs of patients who have the solid tumor. In one 
embodiment, the majority of the first class of patients has a first clinical outcome, and the
majority of the second class of patients has a second clinical outcome.
[0014] In yet another aspect, the present invention further provides methods useful for 
the prognosis or selection of treatment of a solid tumor in a patient of interest. The methods 
include comparing an expression profile of one or more prognosis genes in a peripheral 
blood sample of the patient of interest to at least one reference expression profile of the 
prognosis genes, where the expression levels of each of the prognosis genes in PBMCs of 
patients having the solid tumor are correlated with clinical outcomes of these patients. The 
association between PBMC expression levels and clinical outcome can be determined by a 
statistical method (e.g., Spearman's rank correlation or Cox proportional hazard regression 
model) or a class-based correlation metric (e.g., neighborhood analysis). In one
embodiment, the solid tumor is RCC, and clinical outcome is measured by patient response
to a CCI-779 therapy. In another embodiment, the prognosis genes include at least one gene
selected from Tables 6a, 6b, 6c, 6d, 9a, 9b, 9c, 9d, 10, 11, 12, 13, 16, 20, and 21. 
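By way of example only, Spearman's rank correlation between the baseline PBMC expression level of a single gene and a continuous outcome measure can be sketched as follows. The sketch assumes that every outcome is fully observed (it does not handle censored survival times, for which a survival model such as Cox regression would be more appropriate); the function names and data values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical baseline expression (ppm) of one gene and time to death (days).
baseline_expression = [12.0, 35.0, 8.0, 50.0, 27.0]
ttd_days = [140, 410, 95, 600, 380]
rho = spearman(baseline_expression, ttd_days)
```

Because only ranks enter the calculation, the coefficient is insensitive to the scale on which expression is reported.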
[0015] The present invention also features systems useful for the prognosis or 
selection of treatment of a solid tumor in a patient of interest. The systems include (1) a 
memory or a storage medium comprising data that represent an expression profile of one or 
more prognosis genes in a peripheral blood sample of the patient of interest, (2) a storage
medium comprising data that represent at least one reference expression profile of the
prognosis genes, (3) a program capable of comparing the expression profile of the patient of 
interest to the reference expression profile, and (4) a processor capable of executing the 
program. The expression levels of the prognosis genes in PBMCs of patients having the 
solid tumor are correlated with clinical outcomes of the patients. 

[0016] Moreover, the present invention features nucleic acid or protein arrays useful 
for the prognosis or selection of treatment of a solid tumor in a patient of interest. The
nucleic acid or protein arrays include concentrated probes for solid tumor prognosis genes. 
[0017] Other features, objects, and advantages of the present invention are apparent in 
the detailed description that follows. It should be understood, however, that the detailed 
description, while indicating embodiments of the present invention, is given by way of 
illustration only, not limitation. Various changes and modifications within the scope of the 
invention will become apparent to those skilled in the art from the detailed description. 
[0018] The drawings are provided for illustration, not limitation. All drawings in the
parallel U.S. patent application, entitled "Methods for Prognosis and Treatment of Solid 
Tumors" and filed April 29, 2004, are incorporated herein by reference. 
[0019] Figure 1A depicts expression profiles of class-correlated genes identified by 
nearest-neighbor analysis of patients with survival of less than 150 days versus patients with 
survival of greater than 550 days. The relative expression levels of the class-correlated 
genes (rows) are indicated for each patient (columns) according to the normalized 
expression level scale. 

[0020] Figure 1B shows the comparison of the signal to noise (S2N) similarity metric
scores for class-correlated genes identified in Figure 1A relative to S2N scores for the top
1%, 5%, and 50% of scores for class-correlated genes resulting from randomly permuted
data sets.
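For illustration, the S2N score of a single gene and its comparison against randomly permuted class labels can be sketched as follows. The formula used here, (mean of class A minus mean of class B) divided by (standard deviation of class A plus standard deviation of class B), is the commonly used definition of the signal-to-noise metric and is an assumption, since the formula is not spelled out in this passage. The use of the sample standard deviation, the number of permutations, the function names, and the data are likewise hypothetical.

```python
import random
import statistics


def s2n(class_a, class_b):
    """Signal-to-noise score: (mean_a - mean_b) / (sd_a + sd_b)."""
    return (statistics.mean(class_a) - statistics.mean(class_b)) / (
        statistics.stdev(class_a) + statistics.stdev(class_b)
    )


def permuted_scores(values, n_a, n_permutations=1000, seed=0):
    """S2N scores for one gene after randomly reassigning class labels."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_permutations):
        shuffled = values[:]
        rng.shuffle(shuffled)
        scores.append(s2n(shuffled[:n_a], shuffled[n_a:]))
    return scores


# Hypothetical expression levels (ppm) of one gene in two survival classes.
long_survivors = [42.0, 38.0, 45.0, 40.0]
short_survivors = [12.0, 15.0, 10.0, 14.0]

observed = s2n(long_survivors, short_survivors)
null_scores = permuted_scores(long_survivors + short_survivors, n_a=4)
# Fraction of permuted data sets scoring at least as high as the observed gene.
fraction = sum(score >= observed for score in null_scores) / len(null_scores)
```

A gene whose observed score exceeds, for example, the top 1% of permuted scores is unlikely to be class-correlated by chance alone.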

[0021] Figure 1C illustrates training set cross validation results for predictor gene sets 
of increasing size. Each predictor set was evaluated by cross validation to identify the 
predictor set with the highest accuracy for classification of the samples. In these analyses, a 
58 gene predictor set (77% accuracy) was the optimal classifier. 

[0022] Figure 1D demonstrates cross validation results for each sample using the
58-gene predictor identified in Figure 1C. A leave-one-out cross validation was performed and
the prediction strengths were calculated for each sample in the analysis. For the purposes of
illustration, confidence scores accompanying calls of "TTD > 550 days" were assigned
positive values, while prediction strengths accompanying calls of "TTD < 150 days" were
assigned negative values.
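A leave-one-out cross validation with signed prediction strengths can be sketched, for illustration only, as follows. The sketch assumes a weighted-voting classifier in its commonly described form, in which each gene votes with weight equal to its S2N score times the distance of the sample from the midpoint of the two class means, and in which prediction strength is (winning votes minus losing votes) divided by total votes. Whether this matches the exact procedure behind the figure is not stated here, and all names and data are hypothetical.

```python
import statistics


def train(samples, labels, pos, neg):
    """Per-gene weight (S2N score) and boundary (midpoint of class means)."""
    model = []
    for g in range(len(samples[0])):
        a = [s[g] for s, lab in zip(samples, labels) if lab == pos]
        b = [s[g] for s, lab in zip(samples, labels) if lab == neg]
        mean_a, mean_b = statistics.mean(a), statistics.mean(b)
        weight = (mean_a - mean_b) / (statistics.stdev(a) + statistics.stdev(b))
        model.append((weight, (mean_a + mean_b) / 2))
    return model


def signed_strength(model, sample):
    """Prediction strength, positive for a `pos` call and negative for `neg`."""
    votes = [w * (x - mid) for (w, mid), x in zip(model, sample)]
    v_pos = sum(v for v in votes if v > 0)
    v_neg = -sum(v for v in votes if v < 0)
    total = v_pos + v_neg
    return (v_pos - v_neg) / total if total else 0.0


def leave_one_out(samples, labels, pos, neg):
    """Hold out each sample in turn, train on the rest, score the held-out one."""
    strengths = []
    for i in range(len(samples)):
        model = train(samples[:i] + samples[i + 1:],
                      labels[:i] + labels[i + 1:], pos, neg)
        strengths.append(signed_strength(model, samples[i]))
    return strengths


# Hypothetical two-gene profiles for eight patients.
samples = [[40.0, 10.0], [44.0, 12.0], [38.0, 9.0], [42.0, 13.0],
           [12.0, 30.0], [15.0, 33.0], [10.0, 29.0], [14.0, 35.0]]
labels = ["long"] * 4 + ["short"] * 4
strengths = leave_one_out(samples, labels, pos="long", neg="short")
calls = ["long" if s > 0 else "short" for s in strengths]
accuracy = sum(c == lab for c, lab in zip(calls, labels)) / len(labels)
```

Retraining the model inside the loop keeps each held-out sample from influencing the weights used to classify it.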

[0023] Figure 2A shows the relative gene expression levels of a 42-gene classifier for 
the comparison of patients with intermediate versus poor Motzer risk classification. 
[0024] Figure 2B shows the relative gene expression levels for an 18-gene classifier 
identified in the comparison of patients with progressive disease versus any other clinical 
response. 

[0025] Figure 2C demonstrates the relative gene expression levels for a 6-gene 
classifier identified in the comparison of patients in the lower versus upper quartiles of time 
to disease progression. 

[0026] Figure 2D shows the relative gene expression levels for a 52-gene classifier 
identified in the comparison of patients in the lower versus upper quartiles of survival/time 
to death. 
[0027] Figure 2E depicts the relative expression levels for a 12-gene classifier
identified in the comparison of patients with early (time to disease progression < 106 days) 
versus all other times to disease progression (TTP > 106 days). 

[0028] Figure 3A illustrates the dendrogram of an unsupervised hierarchical
clustering of baseline PBMC profiles in 45 RCC patients using all expressed genes present 
in at least one sample and possessing a frequency of greater than 10 ppm in at least one 
sample (5,424 genes total). PBMC expression profiles in the poor prognosis cluster are 
indicated by subcluster "A," where 9 out of 12 patients with PBMC profiles in this 
subcluster exhibited survival of less than a year. PBMC expression profiles in the good 
prognosis cluster are indicated by subcluster "C," where 10 out of 12 patients with PBMC 
profiles in this subcluster exhibited survival of greater than a year. The median survival for 
patients in subclusters A, B, C, and D is 281 days, 566 days, 573 days, and 502 days, 
respectively. 

[0029] Figure 3B shows baseline expression profiles of selected genes in RCC 
patients. The dendrogram of sample relatedness is indicated.

[0030] Figure 4A illustrates the Kaplan-Meier survival curve for patients in the poor 
and good prognosis subclusters segregated on the basis of gene expression pattern. 
[0031] Figure 4B illustrates the Kaplan-Meier survival curve for patients in the poor 
and good prognosis subclusters segregated on the basis of Motzer risk assessment.
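For illustration, a Kaplan-Meier survival estimate of the kind plotted in such curves can be sketched as follows. This is a minimal sketch using the standard product-limit estimator; the function name and the survival times are hypothetical, and an event flag of False denotes a patient censored at the last date known alive.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: [(time, survival_probability)] at each death time.

    times:  follow-up time in days for each patient.
    events: True if death was observed at that time, False if censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up for five patients in one prognosis subcluster.
times = [100, 200, 200, 300, 400]
events = [True, True, False, True, False]
curve = kaplan_meier(times, events)
```

Computing one such curve per subcluster allows the good and poor prognosis groups to be compared over the same time axis.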
[0032] Figure 5A demonstrates the result of supervised identification of a gene 
classifier for assigning class membership to patients in the good and poor prognosis 
subclusters. The relative expression levels of the most class-correlated genes (rows) are
indicated for each patient (columns) according to the scale described in Figure 1A.
[0033] Figure 5B shows cross validation results for each sample using the gene 
classifier of Figure 5A. A leave-one-out cross validation was performed and the confidence
scores were calculated for each sample in the analysis. Similar to Figure 1D, for the
purposes of illustration, prediction strengths accompanying calls of "survival > 1 year" were 
assigned positive values, while prediction strengths accompanying calls of "survival < 1 
year" were assigned negative values. Asterisks identify the false positives in this clinical 
assay designed to identify short survival times, and arrowheads indicate false negatives. 
[0034] Figure 6A shows the optimal gene classifier for year-long survival identified 
by nearest-neighbor analysis using a more stringent filter (at least 25% present calls, and an 
average frequency no less than 5 ppm). A GeneCluster gene selection approach identifies 
genes distinguishing patients with survival less than 365 days versus patients with survival
greater than 365 days in the training set. The relative expression levels of the most class-
correlated genes (rows) are indicated for each of the patients in the training set (columns)
according to the scale described in Figure 1A.

[0035] Figure 6B evaluates prediction accuracy of gene classifiers of increasing size. 
Accuracy of class assignment for gene classifiers containing between 2 and 60 genes in
steps of 2, and 60-200 genes in steps of 10, was evaluated by leave-one-out cross
validation on the training set of samples. The smallest predictive model with the highest
accuracy was selected (20 gene predictor, indicated by the arrow). 

[0036] Figure 6C demonstrates the result of evaluation of the optimal predictive 
model of Figure 6B on an untested set of RCC PBMC profiles. A k-nearest-neighbors
algorithm using the 20 gene classifier was used to assign class membership to the remaining
14 PBMC profiles, and the prediction strengths associated with the class assignments are
presented for each sample in the analysis. For the purposes of illustration, confidence
scores accompanying calls of "TTD < 365 days" were assigned positive values, while
confidence scores accompanying calls of "TTD > 365 days" were assigned negative values. 
The overall accuracy of the gene classifier was 72%. By defining the clinical assay as the 
identification of favorable outcome, eight of eight patients with favorable outcome were 
correctly identified as having survival greater than one year (positive predictive value of
100%).

[0037] Figure 7A illustrates the optimal gene classifier for greater than 106 day time 
to progression identified by nearest-neighbor analysis using a more stringent filter (at least 
25% present calls, and an average frequency no less than 5 ppm). A GeneCluster gene 
selection approach identifies genes distinguishing patients with TTP less than 106 days 
versus patients with TTP greater than 106 days in the training set. The relative expression 
levels of the most class-correlated genes (rows) are indicated for each of the patients in the 
training set (columns) according to the scale of Figure 1A.

[0038] Figure 7B indicates prediction accuracy of gene classifiers of increasing size. 
Accuracy of class assignment for gene classifiers containing between 2 and 60 genes in
steps of 2, and 60-200 genes in steps of 10, was evaluated by leave-one-out cross
validation on the training set of samples. The smallest predictive model with the highest
accuracy was selected (30 gene predictor, indicated by the arrow). 
[0039] Figure 7C shows the result of evaluation of the optimal predictive model of
Figure 7B on an untested set of RCC PBMC profiles. A k-nearest-neighbors algorithm
using the 30 gene classifier was used to assign class membership to the remaining 14
PBMC profiles, and the prediction strengths associated with the class assignments are
presented for each sample in the analysis. For the purposes of illustration, confidence
scores accompanying calls of "TTP < 106 days" were assigned positive values, while
confidence scores accompanying calls of "TTP > 106 days" were assigned negative values.
The overall accuracy of the gene classifier was 85%. By defining the clinical assay as the
identification of favorable outcome, eight of ten patients with favorable outcome were
correctly identified as having TTP greater than 106 days (positive predictive value of
80%) and three of three patients with poor outcome were correctly predicted to have TTP
less than 106 days (negative predictive value of 100%).



DETAILED DESCRIPTION 

[0040] The present invention provides methods that are useful for prognosis or 
selection of treatment of solid tumors. These methods employ prognosis genes that are 
differentially expressed in peripheral blood samples of solid tumor patients who have 
different clinical outcomes. In many embodiments, the peripheral blood expression profiles 
of these prognosis genes are correlated with patients' clinical outcome or prognosis under a 
statistical method or a correlation model. In many other embodiments, solid tumor patients 
can be divided into at least two classes based on patients' clinical outcome or prognosis, and 
the prognosis genes are substantially correlated with a class distinction between these two 
classes of patients under a neighborhood analysis. 

[0041] The prognosis genes of the present invention can be used as surrogate markers 
for the prediction of clinical outcome of solid tumors. The prognosis genes of the present 
invention can also be used for the identification of optimal treatments of solid tumors. 
Different patients may have distinct clinical responses to a therapeutic treatment due to 
individual heterogeneity of the molecular mechanism of the disease. The identification of 
gene expression patterns that correlate with patient response allows clinicians to select 
treatments based on predicted patient responses and thereby avoid adverse reactions. This
provides improved power and safety of clinical trials and increased benefit/risk ratio for
drugs and other therapeutic treatments. Peripheral blood is a tissue that can be routinely
obtained from patients in a minimally invasive manner. By determining the correlation
between patient outcome and gene expression profiles in peripheral blood samples, the
present invention represents a significant advance in clinical pharmacogenomics and solid 
tumor treatment. 

[0042] Various aspects of the invention are described in further detail in the following
subsections. The use of subsections is not meant to limit the invention. Each subsection 
may apply to any aspect of the invention. In this application, the use of "or" means 
"and/or" unless stated otherwise. 

I. General Methods for Identifying Solid Tumor Prognosis Genes

[0043] Previous studies demonstrated that baseline expression profiles in PBMCs 
from solid tumor patients were significantly distinct from those of disease-free subjects.
See U.S. Provisional Application Serial No. 60/459,782, filed April 3, 2003, U.S. 
Provisional Application Serial No. 60/427,982, filed November 21, 2002, and U.S. Patent 
Application Serial No. 10/717,597, filed November 21, 2003, all of which are incorporated 
herein by reference. Studies also showed that gene expression profiles in PBMCs were 
predictive of anti-cancer drug activity in vivo. See U.S. Provisional Application Serial No. 
60/446,133, filed February 11, 2003, and U.S. Patent Application Serial No. 10/775,169, 
filed February 11, 2004, both of which are incorporated herein by reference. In addition,
studies indicated that PBMC baseline expression profiles were correlated with clinical 
outcomes of RCC or other non-blood diseases. See U.S. Provisional Application Serial No. 
60/466,067, filed April 29, 2003, which is incorporated herein by reference. 
[0044] The present invention further evaluates the correlation between peripheral 
blood gene expression and clinical outcome of solid tumors. Prognosis genes for a variety 
of solid tumors can be identified by the present invention. These genes are differentially 
expressed in peripheral blood samples of solid tumor patients who have different clinical 
outcomes. In many embodiments, the peripheral blood expression profiles of the prognosis 
genes of the present invention are correlated with patient outcome under statistical methods 
or correlation models. Exemplary statistical methods and correlation models include, but 
are not limited to, Spearman's rank correlation, Cox proportional hazard regression model, 
ANOVA/t test, nearest-neighbor analysis, and other rank tests, survival models or class- 
based correlation metrics. 
[0045] Solid tumors amenable to the present invention include, without limitation,
RCC, prostate cancer, head/neck cancer, ovarian cancer, testicular cancer, brain tumor, 
breast cancer, lung cancer, colon cancer, pancreas cancer, stomach cancer, bladder cancer, 
skin cancer, cervical cancer, uterine cancer, and liver cancer. In one embodiment, the solid 
tumors do not have their origin in blood or lymph (hematopoietic) cells. Solid tumors can be
measured or evaluated using direct or indirect visualization procedures. Suitable 
visualization methods include, but are not limited to, scans (such as X-rays, computerized 
axial tomography (CT), magnetic resonance imaging (MRI), positron emission tomography 
(PET), or ultrasonography (U/S)), biopsy, palpation, endoscopy, laparoscopy, and other 
suitable means as appreciated by those skilled in the art. 

[0046] Clinical outcome of solid tumors can be assessed by numerous criteria. In 
many embodiments, clinical outcome is assessed based on patients' response to a
therapeutic treatment. Examples of clinical outcome measures include, without limitation, 
complete response, partial response, minor response, stable disease, progressive disease, 
time to disease progression (TTP), time to death (TTD or Survival), or any combination 
thereof. Examples of solid tumor treatments include, without limitation, drug therapy (e.g., 
CCI-779 therapy), chemotherapy, hormone therapy, radiotherapy, immunotherapy, surgery, 
gene therapy, anti-angiogenesis therapy, palliative therapy, or any combination thereof, or 
other conventional or non-conventional therapies.

[0047] In one embodiment, clinical outcome is evaluated based on the WHO 
Reporting Criteria, such as those described in WHO Publication, No. 48 (World Health 
Organization, Geneva, Switzerland, 1979). Under the Criteria, uni- or bidimensionally 
measurable lesions are measured at each assessment. When multiple lesions are present in 
any organ, up to 6 representative lesions can be selected, if available. 
[0048] In another embodiment, clinical outcome is determined based on a 
classification system composed of clinical categories such as complete response, partial 
response, minor response, stable disease, progressive disease, or any combination thereof. 
"Complete response" (CR) means complete disappearance of all measurable and evaluable 
disease, determined by two observations not less than 4 weeks apart. There is no new lesion 
and no disease related symptom. "Partial response" (PR) in reference to bidimensionally 
measurable disease means decrease by at least about 50% of the sum of the products of the 
largest perpendicular diameters of all measurable lesions as determined by 2 observations 
not less than 4 weeks apart. "Partial response" in reference to unidimensionally measurable 
disease means decrease by at least about 50% in the sum of the largest diameters of all
lesions as determined by 2 observations not less than 4 weeks apart. It is not necessary for 
all lesions to have regressed to qualify for partial response, but no lesion should have 
progressed and no new lesion should appear. The assessment should be objective. "Minor 
response" in reference to bidimensionally measurable disease means about 25% or greater 
decrease but less than about 50% decrease in the sum of the products of the largest 
perpendicular diameters of all measurable lesions. "Minor response" in reference to 
unidimensionally measurable disease means decrease by at least about 25% but less than 
about 50% in the sum of the largest diameters of all lesions. 

[0049] "Stable disease" (SD) in reference to bidimensionally measurable disease
means less than about 25% decrease or less than about 25% increase in the sum of the 
products of the largest perpendicular diameters of all measurable lesions. "Stable disease" 
in reference to unidimensionally measurable disease means less than about 25% decrease or 
less than about 25% increase in the sum of the diameters of all lesions. No new lesions 
should appear. "Progressive disease" (PD) refers to a greater than or equal to about a 25% 
increase in the size of at least one bidimensionally (product of the largest perpendicular 
diameters) or unidimensionally measurable lesion or appearance of a new lesion. The 
occurrence of pleural effusion or ascites is also considered as progressive disease if this is 
substantiated by positive cytology. Pathological fracture or collapse of bone is not 
necessarily evidence of disease progression. 
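The response categories defined above for bidimensionally measurable disease can be sketched, for illustration only, as a simple decision rule. The sketch makes several simplifying assumptions that are not part of the definitions: the "about" qualifiers are treated as exact thresholds, every lesion has a non-zero baseline measurement, confirmation by two observations not less than 4 weeks apart is assumed to have been made elsewhere, and symptoms, effusions and cytology are ignored. The function name and measurements are hypothetical.

```python
def classify_bidimensional(baseline, follow_up, new_lesion=False):
    """Classify response from per-lesion products of perpendicular diameters.

    baseline, follow_up: products of the largest perpendicular diameters
    (e.g. in cm^2), one entry per measurable lesion, in the same order.
    """
    # PD: a new lesion, or any single lesion growing by 25% or more.
    if new_lesion or any((f - b) / b >= 0.25 for b, f in zip(baseline, follow_up)):
        return "PD"
    # CR: complete disappearance of all measurable disease.
    if all(f == 0 for f in follow_up):
        return "CR"
    change = (sum(follow_up) - sum(baseline)) / sum(baseline)
    if change <= -0.50:
        return "PR"   # decrease of at least 50% in the sum of products
    if change <= -0.25:
        return "MR"   # decrease of 25% or more but less than 50%
    return "SD"       # less than 25% decrease and less than 25% increase


# Hypothetical measurements for a patient with two lesions.
response = classify_bidimensional([4.0, 6.0], [1.5, 3.0])
```

Progression is tested first because, under the definitions above, growth of a single lesion or a new lesion overrides any shrinkage in the remaining lesions.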

[0050] In yet another embodiment, overall subject tumor response for uni- and 
bidimensionally measurable disease is determined according to Table 1. 



Table 1. Overall Subject Tumor Response

Response in              Response in               Overall Subject
Bidimensionally          Unidimensionally          Tumor Response
Measurable Disease       Measurable Disease
-----------------------  ------------------------  ---------------
PD                       Any                       PD
Any                      PD                        PD
SD                       SD or PR                  SD
SD                       CR                        PR
PR                       SD or PR or CR            PR
CR                       SD or PR                  PR
CR                       CR                        CR

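Table 1 can be read as a lookup from the pair of per-modality responses to an overall response. The Python sketch below encodes its seven rows. The function name `overall_response` and the two-letter string codes are illustrative assumptions, and categories that Table 1 does not list (such as minor response) are deliberately left unmapped.

```python
# Illustrative encoding of Table 1; names are hypothetical, not from the source.
_TABLE_1 = {
    ("SD", "SD"): "SD", ("SD", "PR"): "SD", ("SD", "CR"): "PR",
    ("PR", "SD"): "PR", ("PR", "PR"): "PR", ("PR", "CR"): "PR",
    ("CR", "SD"): "PR", ("CR", "PR"): "PR", ("CR", "CR"): "CR",
}


def overall_response(bidim: str, unidim: str) -> str:
    """Return the overall response for one subject according to Table 1."""
    # Rows 1 and 2: progression in either modality means overall progression.
    if "PD" in (bidim, unidim):
        return "PD"
    # Remaining rows; a KeyError signals a combination Table 1 does not cover.
    return _TABLE_1[(bidim, unidim)]
```

For instance, a subject with stable bidimensional disease and a complete unidimensional response maps to an overall partial response.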
[0051] Overall subject tumor response for non-measurable disease can be assessed, 
for instance, in the following situations: 

a) Overall complete response: if non-measurable disease is present, it 
should disappear completely. Otherwise, the subject cannot be considered as an "overall 
complete responder." 

b) Overall progression: in case of a significant increase in the size of non- 
measurable disease or the appearance of a new lesion, the overall response will be 
progression. 

[0052] Clinical outcome can also be assessed by other criteria. For instance, clinical 
outcome can be measured by TTP or TTD. TTP refers to the interval from the date of 
initiation of a therapeutic treatment until the first day of measurement of progressive 
disease. TTD refers to the interval from the date of initiation of a therapeutic treatment to 
the time of death, or censored at the last date known alive. 
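The two intervals defined above can be computed from dates as in the following Python sketch. The function name `time_to_event` is hypothetical. The text states censoring only for TTD (at the last date known alive), so applying the same censoring rule to TTP is an assumption of this sketch.

```python
from datetime import date
from typing import Optional, Tuple


def time_to_event(start: date, event: Optional[date],
                  last_known: date) -> Tuple[int, bool]:
    """Return (interval in days, whether the event was observed).

    start      -- date of initiation of the therapeutic treatment
    event      -- first date of measured progression (TTP) or date of death (TTD),
                  or None if the event has not been observed
    last_known -- last follow-up date, used for censoring when event is None
    """
    if event is not None:
        return (event - start).days, True
    # No event observed: the interval is censored at the last follow-up date.
    return (last_known - start).days, False
```

The boolean flag is what a survival model needs to distinguish an observed event from a censored observation.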

[0053] Moreover, clinical outcome can include prognoses based on traditional clinical 
risk assessment methods. In many cases, these risk assessment methods employ numerous 
prognostic factors to classify patients into different prognosis or risk groups. One example 
is Motzer risk assessment for RCC, as described in Motzer, et al., J Clin Oncol, 17:2530-2540 
(1999). Patients in different risk groups may have different responses to a therapy. 

[0054] Peripheral blood samples employed in the present invention can be isolated 
from solid tumor patients at any disease or treatment stage. In one embodiment, the 
peripheral blood samples are isolated from solid tumor patients prior to a therapeutic 
treatment. These blood samples are "baseline samples" with respect to the therapeutic 
treatment. 

[0055] A variety of peripheral blood samples can be used in the present invention. In 
one embodiment, the peripheral blood samples are whole blood samples. In another 
embodiment, the peripheral blood samples comprise enriched PBMCs. By "enriched," it 
means that the percentage of PBMCs in the sample is higher than that in whole blood. In 
some cases, the PBMC percentage in an enriched sample is at least 1, 2, 3, 4, 5 or more 
times higher than that in whole blood. In some other cases, the PBMC percentage in an 
enriched sample is at least 90%, 95%, 98%, 99%, 99.5%, or more. Blood samples 
containing enriched PBMCs can be prepared using any method known in the art, such as 
Ficoll gradient centrifugation or CPTs (cell purification tubes). 
[0056] The relationship between peripheral blood gene expression profiles and patient 
outcome can be evaluated using global gene expression analyses. Methods suitable for this 
purpose include, but are not limited to, nucleic acid arrays (such as cDNA or 
oligonucleotide arrays), 2-dimensional SDS-polyacrylamide gel electrophoresis/mass 
spectrometry, and other high throughput nucleotide or polypeptide detection techniques. 
[0057] Nucleic acid arrays allow for quantitative detection of the expression levels of 
a large number of genes at one time. Examples of nucleic acid arrays include, but are not 
limited to, Genechip® microarrays from Affymetrix (Santa Clara, CA), cDNA microarrays 
from Agilent Technologies (Palo Alto, CA), and bead arrays described in U.S. Patent Nos. 
6,288,220 and 6,391,562. 

[0058] The polynucleotides to be hybridized to nucleic acid arrays can be labeled 
with one or more labeling moieties to allow for detection of hybridized polynucleotide 
complexes. The labeling moieties can include compositions that are detectable by 
spectroscopic, photochemical, biochemical, bioelectronic, immunochemical, electrical, 
optical or chemical means. Exemplary labeling moieties include radioisotopes, 
chemiluminescent compounds, labeled binding proteins, heavy metal atoms, spectroscopic 
markers such as fluorescent markers and dyes, magnetic labels, linked enzymes, mass 
spectrometry tags, spin labels, electron transfer donors and acceptors, and the like. 
Unlabeled polynucleotides can also be employed. The polynucleotides can be DNA, RNA, 
or a modified form thereof. 

[0059] Hybridization reactions can be performed in absolute or differential 
hybridization formats. In the absolute hybridization format, polynucleotides derived from 
one sample, such as PBMCs from a patient in a selected outcome class, are hybridized to 
the probes on a nucleic acid array. Signals detected after the formation of hybridization 
complexes correlate to the polynucleotide levels in the sample. In the differential 
hybridization format, polynucleotides derived from two biological samples, such as one 
from a patient in a first outcome class and the other from a patient in a second outcome 
class, are labeled with different labeling moieties. A mixture of these differently labeled 
polynucleotides is added to a nucleic acid array. The nucleic acid array is then examined 
under conditions in which the emissions from the two different labels are individually 
detectable. In one embodiment, the fluorophores Cy3 and Cy5 (Amersham Pharmacia 
Biotech, Piscataway, N.J.) are used as the labeling moieties for the differential hybridization 
format. 
[0060] Signals gathered from nucleic acid arrays can be analyzed using commercially 
available software, such as that provided by Affymetrix or Agilent Technologies. Controls, 
such as for scan sensitivity, probe labeling and cDNA/cRNA quantitation, can be included 
in the hybridization experiments. In many embodiments, the nucleic acid array expression 
signals are scaled or normalized before being subject to further analysis. For instance, the 
expression signals for each gene can be normalized to take into account variations in 
hybridization intensities when more than one array is used under similar test conditions. 
Signals for individual polynucleotide complex hybridization can also be normalized using 
the intensities derived from internal normalization controls contained on each array. In 
addition, genes with relatively consistent expression levels across the samples can be used 
to normalize the expression levels of other genes. In one embodiment, the expression levels 
of the genes are normalized across the samples such that the mean is zero and the standard 
deviation is one. In another embodiment, the expression data detected by nucleic acid 
arrays are subject to a variation filter which excludes genes showing minimal or 
insignificant variation across all samples. 
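The per-gene normalization to zero mean and unit standard deviation, and a variation filter, can be sketched in Python as follows. The function names are hypothetical. The text does not say whether the population or the sample standard deviation is used, nor what threshold defines "minimal" variation, so the population form and the max/min fold criterion below are assumptions made for illustration.

```python
import statistics


def standardize(values):
    """Rescale one gene's expression levels to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD (assumption)
    if sd == 0:
        raise ValueError("gene has no variation across samples")
    return [(v - mean) / sd for v in values]


def variation_filter(expr, min_fold=2.0):
    """Keep genes whose max/min ratio across samples is at least min_fold.

    expr maps a gene identifier to its list of positive expression levels.
    The fold criterion is an assumed example of a variation filter.
    """
    return {gene: levels for gene, levels in expr.items()
            if min(levels) > 0 and max(levels) / min(levels) >= min_fold}
```

Filtering before standardizing avoids amplifying noise in genes that barely vary, since standardization rescales every retained gene to the same spread.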

[0061] The gene expression data collected from nucleic acid arrays can be correlated 
with clinical outcome using a variety of methods. Suitable correlation methods include, but 
are not limited to, statistical methods (such as Spearman's rank correlation, Cox 
proportional hazard regression model, ANOVA/t test, or other suitable rank tests or survival 
models) and class-based correlation metrics (such as nearest-neighbor analysis). 
[0062] In one aspect, class-based correlation metrics are used to identify the 
correlation between peripheral blood gene expression and clinical outcome. In one 
embodiment, patients with a specified solid tumor are divided into at least two classes based 
on their clinical stratifications. The correlation between peripheral blood gene expression 
(e.g., in PBMCs) and clinical outcome is analyzed by a supervised cluster algorithm. 
Exemplary supervised clustering algorithms include, but are not limited to, nearest-neighbor 
analysis, support vector machines, and SPLASH. Under the supervised cluster algorithms, 
clinical outcome of each class of patients is either known or determinable. Genes that are 
differentially expressed in peripheral blood cells (e.g., PBMCs) of one class of patients 
relative to the other class of patients can be identified. In many cases, the genes thus 
identified are substantially correlated with a class distinction between the two classes of 
patients. The genes thus identified can be used as surrogate markers for predicting clinical 
outcome of the solid tumor in a patient of interest. 
[0063] In another embodiment, patients with a specified solid tumor can be divided 
into at least two classes based on gene expression profiles in their peripheral blood cells. 
Methods suitable for this purpose include unsupervised clustering algorithms, such as self- 
organized maps (SOMs), k-means, principal component analysis, and hierarchical 
clustering. A substantial number (e.g., at least 50%, 60%, 70%, 80%, 90%, or more) of 
patients in one class may have a first clinical outcome, and a substantial number of patients 
in the other class may have a second clinical outcome. Genes that are differentially 
expressed in the peripheral blood cells of one class of patients relative to the other class of 
patients can be identified. These genes are prognosis genes for the solid tumor. 
[0064] In yet another embodiment, patients with a specified solid tumor can be 
divided into three or more classes based on their clinical stratifications or peripheral blood 
gene expression profiles. Multi-class correlation metrics can be employed to identify genes 
that are differentially expressed in these classes. Exemplary multi-class correlation metrics 
include, but are not limited to, GeneCluster 2 software provided by MIT Center for Genome 
Research at Whitehead Institute (Cambridge, MA). 

[0065] In a further embodiment, nearest-neighbor analysis (also known as 
neighborhood analysis) is used to analyze gene expression data gathered from nucleic acid 
arrays. The algorithm for neighborhood analysis is described in Golub, et al., Science, 
286:531-537 (1999), Slonim, et al., Procs. of the Fourth Annual International 
Conference on Computational Molecular Biology, Tokyo, Japan, April 8-11, p263-272 
(2000), and U.S. Patent No. 6,647,341, all of which are incorporated herein by 
reference. Under one form of the neighborhood analysis, the expression profile of each 
gene can be represented by an expression vector $g = (e_1, e_2, e_3, \ldots, e_n)$, where $e_i$ corresponds 
to the expression level of gene "g" in the ith sample. A class distinction can be represented 
by an idealized expression pattern $c = (c_1, c_2, c_3, \ldots, c_n)$, where $c_i$ = 1 or -1, depending on 
whether the ith sample is isolated from class 0 or class 1. Class 0 may include patients 
having a first clinical outcome, and class 1 includes patients having a second clinical 
outcome. Other forms of class distinction can also be employed. Typically, a class 
distinction represents an idealized expression pattern, where the expression level of a gene 
is uniformly high for samples in one class and uniformly low for samples in the other class. 
[0066] The correlation between gene "g" and the class distinction can be measured by 
a signal-to-noise score: 

$$P(g,c) = \frac{\mu_1(g) - \mu_2(g)}{\sigma_1(g) + \sigma_2(g)}$$

where $\mu_1(g)$ and $\mu_2(g)$ represent the means of the log-transformed expression levels of gene 
"g" in class 0 and class 1, respectively, and $\sigma_1(g)$ and $\sigma_2(g)$ represent the standard deviations 
of the log-transformed expression levels of gene "g" in class 0 and class 1, respectively. A 
higher absolute value of a signal-to-noise score indicates that the gene is more highly 
expressed in one class than in the other. In one embodiment, the samples used to derive the 
signal-to-noise score comprise enriched or purified PBMCs. Thus, the signal-to-noise score 
P(g,c) can represent a correlation between the class distinction and the expression level of 
gene "g" in PBMCs. 
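The signal-to-noise score can be computed directly from its definition, as in this Python sketch. The function name `signal_to_noise` is hypothetical, the inputs are assumed to be already log-transformed, and the use of the sample (n - 1) standard deviation is an assumption because the text does not specify the estimator.

```python
import statistics


def signal_to_noise(log_expr, classes):
    """Signal-to-noise score P(g,c) for one gene.

    log_expr -- log-transformed expression levels, one per sample
    classes  -- class label per sample, 0 or 1
    """
    class0 = [e for e, c in zip(log_expr, classes) if c == 0]
    class1 = [e for e, c in zip(log_expr, classes) if c == 1]
    mu1, mu2 = statistics.fmean(class0), statistics.fmean(class1)
    # Sample standard deviation (assumption); needs at least 2 samples per class.
    sigma1, sigma2 = statistics.stdev(class0), statistics.stdev(class1)
    return (mu1 - mu2) / (sigma1 + sigma2)
```

A positive score indicates higher expression in class 0, a negative score higher expression in class 1. The function raises `ZeroDivisionError` when a gene is constant within both classes.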

[0067] The correlation between gene "g" and the class distinction can also be 
measured by other methods, such as by the Pearson correlation coefficient or the Euclidean 
distance, as appreciated by those skilled in the art. 

[0068] The significance of the correlation between peripheral blood gene expression 
patterns and the class distinction can be evaluated using a random permutation test. An 
unusually high density of genes within the neighborhoods of the class distinction, as 
compared to random patterns, suggests that many genes have expression patterns that are 
significantly correlated with the class distinction. The correlation between genes and the 
class distinction can be diagrammatically viewed through a neighborhood analysis plot, in 
which the y-axis represents the number of genes within various neighborhoods around the 
class distinction and the x-axis indicates the size of the neighborhood (i.e., P(g,c)). Curves 
showing different significance levels for the number of genes within corresponding 
neighborhoods of randomly permuted class distinctions can also be included in the plot. 
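The random permutation test described above can be sketched in Python as follows. All names are hypothetical. The sketch counts genes whose signed score is at least a given neighborhood size, re-counts after shuffling the class labels, and reports how often a permuted count reaches the observed one. The one-sided count and the sample standard deviation are assumptions, not details given in the text.

```python
import random
import statistics


def s2n(expr, labels):
    """Signal-to-noise score of one gene for a given labeling (0.0 if undefined)."""
    a = [e for e, c in zip(expr, labels) if c == 0]
    b = [e for e, c in zip(expr, labels) if c == 1]
    denom = statistics.stdev(a) + statistics.stdev(b)
    return (statistics.fmean(a) - statistics.fmean(b)) / denom if denom else 0.0


def neighborhood_count(genes, labels, size):
    """Number of genes whose score lies in the neighborhood of the given size."""
    return sum(1 for expr in genes.values() if s2n(expr, labels) >= size)


def permutation_fraction(genes, labels, size, n_perm=1000, seed=0):
    """Return (observed count, fraction of permutations with count >= observed)."""
    rng = random.Random(seed)
    observed = neighborhood_count(genes, labels, size)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # random class distinction with the same class sizes
        if neighborhood_count(genes, shuffled, size) >= observed:
            hits += 1
    return observed, hits / n_perm
```

Evaluating `permutation_fraction` over a range of sizes gives the points of a neighborhood analysis plot, and the permuted counts at each size give the significance curves. Each class is assumed to contain at least two samples.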
[0069] In one embodiment, the prognosis genes of the present invention are 
substantially correlated with a class distinction between two outcome classes. In one 
example, the prognosis genes are above the median significance level in the neighborhood 
analysis plot. This means that the correlation measure P(g,c) for each prognosis gene is 
such that the number of genes within the neighborhood of the class distinction having the 
size of P(g,c) is greater than the number of genes within the corresponding neighborhoods 
of randomly permuted class distinctions at the median significance level. In another 
example, the employed prognosis genes are above the 10%, 5%, 2%, or 1% significance 
level. As used herein, x% significance level means that x% of random neighborhoods 
contain as many genes as the real neighborhood around the class distinction. 
[0070] Class predictors can be constructed using the prognosis genes of the present 
invention. These class predictors are useful for assigning class membership to solid tumor 
patients. In one embodiment, the prognosis genes in a class predictor are limited to those 
shown to be significantly correlated with the class distinction by the permutation test, such 
as those above the 1%, 2%, 5%, 10%, 20%, 30%, 40%, or 50% significance level. In 
another embodiment, the expression level of each prognosis gene in a class predictor is 
substantially higher or substantially lower in PBMCs of one class of patients than in the 
other class of patients. In still another embodiment, the prognosis genes in a class predictor 
have top absolute values of P(g,c). In yet another embodiment, the p-value under a 
Student's t-test (e.g., two-tailed distribution, two sample unequal variance) for each 
differentially expressed prognosis gene is no more than 0.05, 0.01, 0.005, 0.001, 0.0005, 
0.0001, or less. 

[0071] In a further embodiment, the class predictors of the present invention have at 
least 50% accuracy for leave-one-out cross validation. In another embodiment, the class 
predictors of the present invention have at least 60%, 70%, 80%, 90%, 95%, or 99% 
accuracy for leave-one-out cross validation. 
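Leave-one-out cross validation holds out each sample in turn, trains on the rest, and scores the held-out prediction. The Python sketch below illustrates the procedure with a nearest-centroid classifier, which is a stand-in chosen for brevity and not a classifier named in the text. The function names are hypothetical.

```python
import statistics


def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose mean expression vector is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroid = [statistics.fmean(col) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(samples, labels):
    """Fraction of samples classified correctly when each is held out once.

    samples -- list of per-sample expression vectors (lists of floats)
    labels  -- list of class labels, one per sample
    """
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)
```

For an unbiased accuracy estimate, any gene selection step would also need to be repeated inside each fold using only the training samples. The sketch omits that step.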

[0072] In another aspect, the correlation between peripheral blood gene expression 
profiles and clinical outcome can be evaluated by statistical methods. Clinical outcomes 
suitable for these analyses include, but are not limited to, TTP, TTD, and other time-associated 
clinical indicators. One exemplary statistical method employs Spearman's rank 
correlation coefficient, which has the formula of: 

$$r_s = \frac{SS_{UV}}{(SS_{UU}\,SS_{VV})^{1/2}}$$

where $SS_{UV} = \sum U_i V_i - [(\sum U_i)(\sum V_i)]/n$, $SS_{UU} = \sum U_i^2 - [(\sum U_i)^2]/n$, and 
$SS_{VV} = \sum V_i^2 - [(\sum V_i)^2]/n$. $U_i$ is the expression level ranking of a gene of interest, $V_i$ is the ranking of the 
clinical outcome, and n represents the number of patients. The shortcut formula for 
Spearman's rank correlation coefficient is $r_s = 1 - (6 \sum d_i^2)/[n(n^2 - 1)]$, where $d_i = U_i - V_i$. 
The Spearman's rank correlation is similar to the Pearson's correlation except that it is 
based on ranks and is thus more suitable for data that is not normally distributed. See, for 
example, Snedecor and Cochran, Statistical Methods, Eighth edition, Iowa State 
University Press, Ames, Iowa, 503 pp, 1989. The correlation coefficient is tested to assess 
whether it differs significantly from a value of 0 (i.e., no correlation). 
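Both forms of the Spearman coefficient given above can be checked against each other with a short Python sketch. The function names are hypothetical. Ties are assigned average ranks, which is a common convention assumed here and not stated in the text. The shortcut formula agrees with the general formula only when there are no ties.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """General formula: SS_UV / sqrt(SS_UU * SS_VV), computed on ranks."""
    u, v = ranks(x), ranks(y)
    n = len(u)
    ss_uv = sum(a * b for a, b in zip(u, v)) - sum(u) * sum(v) / n
    ss_uu = sum(a * a for a in u) - sum(u) ** 2 / n
    ss_vv = sum(b * b for b in v) - sum(v) ** 2 / n
    return ss_uv / (ss_uu * ss_vv) ** 0.5


def spearman_shortcut(x, y):
    """Shortcut formula: 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)); valid without ties."""
    u, v = ranks(x), ranks(y)
    n = len(u)
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

Here `x` would hold one gene's expression levels across patients and `y` the corresponding clinical outcome, such as TTP or TTD.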
[0073] The correlation coefficients for each prognosis gene identified by the 
Spearman's rank correlation can be either positive or negative, provided that the correlation 
is statistically significant. In many embodiments, the p-value for each prognosis gene thus 
identified is no more than 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001, or less. In many other 
embodiments, the Spearman correlation coefficients of the prognosis genes thus identified 
have absolute values of at least 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, or more. 
[0074] Another exemplary statistical method is Cox proportional hazard regression 
model, which has the formula of: 

$$\log h_i(t) = \alpha(t) + \beta_j x_{ij}$$

where $h_i(t)$ is the hazard function that assesses the instantaneous risk of demise at time t, 
conditional on survival to that time, $\alpha(t)$ is the baseline hazard function, and $x_{ij}$ is a covariate 
which may represent, for example, the expression level of prognosis gene j in a peripheral 
blood sample. See Cox, Journal of the Royal Statistical Society, Series B 34:187 
(1972). Additional covariates, such as interactions between covariates, can also be included 
in Cox proportional hazard model. As used herein, the terms "demise" or "survival" are not 
limited to real death or survival. Instead, these terms should be interpreted broadly to cover 
any type of time-associated events, such as TTP. In many cases, the p-values for the 
correlation under Cox proportional hazard regression model are no more than 0.05, 0.01, 
0.005, 0.001, 0.0005, 0.0001, or less. The p-values for the prognosis genes identified under 
Cox proportional hazard regression model can be determined by the likelihood ratio test, 
Wald test, the Score test, or the log-rank test. In one embodiment, the hazard ratios for the 
prognosis genes thus identified are at least 1.5, 2, 3, 4, 5, or more. In another embodiment, 
the hazard ratios for the prognosis genes thus identified are no more than 0.67, 0.5, 0.33, 
0.25, 0.2, or less. 
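For a single covariate, the hazard ratio per unit increase is the exponential of the regression coefficient, which is why the thresholds above pair as reciprocals (for example, 1.5 and roughly 0.67). The Python sketch below shows that relationship together with the partial log-likelihood commonly maximized to fit the coefficient. The function names are hypothetical, tied event times are assumed absent, and the fitting procedure itself is not described in the text.

```python
import math


def hazard_ratio(beta):
    """Hazard ratio associated with a one-unit increase in the covariate."""
    return math.exp(beta)


def cox_partial_loglik(beta, times, events, x):
    """Cox partial log-likelihood for one covariate, assuming no tied event times.

    times  -- follow-up time per patient
    events -- 1 if the event was observed at that time, 0 if censored
    x      -- covariate value per patient (e.g., an expression level)
    """
    loglik = 0.0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            continue  # censored patients contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = sum(math.exp(beta * x[j])
                   for j in range(len(times)) if times[j] >= t_i)
        loglik += beta * x[i] - math.log(risk)
    return loglik
```

The baseline hazard cancels out of this likelihood, so the coefficient can be estimated without specifying the baseline hazard function.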

[0075] Other rank tests, scores, measurements, or models can also be employed to 
identify prognosis genes whose expression profiles in peripheral blood samples are 
correlated with clinical outcome of solid tumors. These tests, scores, measurements, or 
models can be either parametric or nonparametric, and the regression may be either linear or 
non-linear. Many statistical methods and correlation/regression models can be carried out 
using commercially available programs. 

[0076] Other methods capable of identifying genes differentially expressed in 
peripheral blood cells of one class of patients relative to another class of patients can be 
used. These methods include, but are not limited to, RT-PCR, Northern Blot, in situ 
hybridization, and immunoassays such as ELISA, RIA or Western Blot. The expression 
levels of genes thus identified can be substantially higher or substantially lower in 
peripheral blood cells (e.g., PBMCs) of one class of patients than in another class of 
patients. In some cases, the average peripheral blood expression level of a prognosis gene 
in PBMCs of one class of patients can be at least 2, 3, 4, 5, 10, 20, or more fold higher or 
lower than that in another class of patients. In many embodiments, the p-value of an 
appropriate statistical significance test (e.g., Student's t-test) for the difference between 
average expression levels is no more than 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001, or less. 
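The fold difference between class averages and the test statistic for an unequal-variance (Welch) t-test can be computed as in this Python sketch. The function names are hypothetical. The sketch stops at the statistic and its degrees of freedom because the Python standard library has no t-distribution, so obtaining the p-value would need a statistics package such as SciPy.

```python
import statistics


def fold_change(class_a, class_b):
    """Ratio of the average expression level in class A to that in class B."""
    return statistics.fmean(class_a) / statistics.fmean(class_b)


def welch_t(class_a, class_b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    var_a, var_b = statistics.variance(class_a), statistics.variance(class_b)
    n_a, n_b = len(class_a), len(class_b)
    se2 = var_a / n_a + var_b / n_b  # squared standard error of the difference
    t = (statistics.fmean(class_a) - statistics.fmean(class_b)) / se2 ** 0.5
    df = se2 ** 2 / ((var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1))
    return t, df
```

A fold change below 1 corresponds to lower expression in class A, so a gene that is 2-fold lower has a ratio of 0.5.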
[0077] Prognosis genes for other non-blood diseases can be similarly identified 
according to the present invention, provided that the correlation between peripheral blood 
gene expression and clinical outcome of these diseases is statistically significant. The 
peripheral blood expression patterns of the prognosis genes thus identified are indicative of 
clinical outcome of these diseases. 

II. Identification of RCC Prognosis Genes 

[0078] RCC comprises the majority of all cases of kidney cancer and is one of the ten 
most common cancers in industrialized countries, comprising 2% of adult malignancies and 
2% of cancer-related deaths. Several prognostic factors and scoring indices have been 
developed for patients diagnosed with RCC, typified by multivariate assessments of several 
key indicators. As an example, one prognostic scoring system employs the five prognostic 
factors proposed by Motzer, et al., supra, namely, Karnofsky performance status, serum 
lactate dehydrogenase, hemoglobin, serum calcium, and presence/absence of prior 
nephrectomy. 

[0079] The present invention identifies numerous RCC prognosis genes whose 
peripheral blood expression profiles correlate with patient outcome in CCI-779 therapy. In 
a clinical trial, the cytostatic mTOR inhibitor CCI-779 was evaluated in RCC patients for its 
anti-cancer effect. PBMCs collected prior to CCI-779 therapy were analyzed on 
oligonucleotide arrays in order to determine whether mononuclear cells from RCC patients 
possessed transcriptional patterns predictive of patient outcome. The results of both 
supervised and unsupervised analyses indicated that transcriptional profiles in the surrogate 
tissue of PBMCs from RCC patients prior to treatment with CCI-779 are significantly 
correlated with patient outcome. 

[0080] PBMCs were isolated prior to CCI-779 therapy from peripheral blood of 45 
advanced RCC patients (18 females and 27 males) participating in a phase 2 clinical trial 
study. Written informed consent for the pharmacogenomic portion of the clinical study was 
received for all individuals and the project was approved by the local Institutional Review 
Boards at the participating clinical sites. RCC tumors of patients were classified at the 
clinical sites as conventional (clear cell) carcinomas (24), granular (1), papillary (3), or 
mixed subtypes (7). Ten tumors were classified as unknown. RCC patients were primarily 
of Caucasian descent (44 Caucasian, 1 African-American) and had a mean age of 58 years 
(range of 40 - 78 years). Inclusion criteria included patients with histologically confirmed 
advanced renal cancer who had received prior therapy for advanced disease, or who had not 
received prior therapy for advanced disease but were not appropriate candidates to receive 
high doses of IL-2 therapy. Other inclusion criteria included patients with (1) bidimensionally 
measurable evidence of disease; (2) evidence of progression of the disease 
prior to study entry; (3) an age of 18 years or older; (4) ANC > 1500/µL, platelet > 
100,000/µL and hemoglobin > 8.5 g/dL; (5) adequate renal function evidenced by serum 
creatinine < 1.5 x upper limit of normal; (6) adequate hepatic function evidenced by 
bilirubin < 1.5 x upper limit of normal and AST < 3x upper limit of normal (or AST < 5x 
upper limit of normal if liver metastases were present); (7) serum cholesterol < 350 mg/dL, 
triglycerides < 300 mg/dL; (8) ECOG performance status 0-1; and (9) a life expectancy of 
at least 12 weeks. Exclusion criteria included patients who had (1) the presence of known 
CNS metastases; (2) surgery or radiotherapy within 3 weeks of start of dosing; (3) 
chemotherapy or biologic therapy for RCC within 4 weeks of start of dosing; (4) treatment 
with a prior investigational agent within 4 weeks of start of dosing; (5) 
immunocompromised status including those known to be HIV positive, or receiving 
concurrent use of immunosuppressive agents including corticosteroids; (6) active infections; 
(7) required treatment with anticonvulsant therapy; (8) presence of unstable 
angina/myocardial infarction within 6 months/ongoing treatment of life-threatening 
arrhythmia; (9) history of prior malignancy in past 3 years; (10) hypersensitivity to macrolide 
antibiotics; and (11) pregnancy or any other illness which would substantially increase the 
risk associated with participation in the study. 

[0081] These advanced RCC patients were treated with one of 3 doses of CCI-779 
(25 mg, 75 mg, or 250 mg) administered as a 30 minute intravenous (IV) infusion once 
weekly for the duration of the trial. CCI-779 is an ester analog of the immunosuppressant 
rapamycin and as such is a potent, selective inhibitor of the mammalian target of rapamycin. 
The mammalian target of rapamycin (mTOR) activates multiple signaling pathways, 
including phosphorylation of p70s6kinase, which results in increased translation of 5' TOP 
mRNAs encoding proteins involved in translation and entry into the G1 phase of the cell 
cycle. By virtue of its inhibitory effects on mTOR and cell cycle control, CCI-779 
functions as a cytostatic and immunosuppressive agent. 

[0082] Clinical staging and size of residual, recurrent or metastatic disease were 
recorded prior to treatment and every 8 weeks following initiation of CCI-779 therapy. 
Tumor size was measured in centimeters and reported as the product of the longest diameter 
and its perpendicular. Measurable disease was defined as any bidimensionally measurable 
lesion where both diameters > 1.0 cm by CT-scan, X-ray or palpation. Tumor response was 
determined by the sum of the products of all measurable lesions. The categories for 
assignment of clinical response were given by the clinical protocol definitions (i.e., 
progressive disease, stable disease, minor response, partial response, and complete 
response). The category for assignment of prognosis under the Motzer risk assessment 
(favorable vs intermediate vs poor) was also used. Among the 45 RCC patients, 6 were 
assigned a favorable risk assessment, 17 patients possessed an intermediate risk score, and 
22 patients received a poor prognosis classification. In addition to the categorical 
classifications, overall survival and time to disease progression were also monitored as 
clinical endpoints. 
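The tumor measurements described in this paragraph reduce to a few arithmetic steps, sketched below in Python. The function names are hypothetical, and the percent-change helper is added only to show how the sum of products would feed the response categories. The strict "greater than 1.0 cm" comparison follows the wording above.

```python
def is_measurable(longest_cm, perpendicular_cm, min_cm=1.0):
    """A lesion is measurable when both diameters exceed min_cm."""
    return longest_cm > min_cm and perpendicular_cm > min_cm


def tumor_burden(lesions):
    """Sum of (longest diameter x perpendicular) over measurable lesions, in cm^2.

    lesions -- list of (longest_cm, perpendicular_cm) tuples
    """
    return sum(a * b for a, b in lesions if is_measurable(a, b))


def percent_change(baseline, followup):
    """Percent change in tumor burden relative to baseline (negative = shrinkage)."""
    return 100.0 * (followup - baseline) / baseline
```

For example, a burden that falls from 9.0 to 4.5 is a 50% decrease, the threshold the definitions above associate with partial response in bidimensionally measurable disease.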

[0083] HgU95A genechips (manufactured by Affymetrix) were used to detect 
baseline expression profiles in PBMCs of the RCC patients prior to the CCI-779 therapy. 
Each HgU95A genechip comprises over 12,600 human sequences according to the 
Affymetrix Expression Analysis Technical Manual. RNA transcripts were first isolated 
from PBMCs of the RCC patients. cRNA was then prepared and hybridized to the 
genechips according to protocols described in the Affymetrix Expression Analysis 
Technical Manual. Hybridization signals were collected, scaled, and normalized before 
being subject to further analysis. In one example, the log of the expression level for each 
gene was normalized across the samples such that the mean is zero and the standard 
deviation is one. 

[0084] The expression profiling analysis revealed that of the 12,626 genes on the 
HgU95A chip, 5,424 genes met the initial criteria (i.e., at least 1 present call across the data 
set and at least 1 frequency > 10 ppm). On average, 4,023 transcripts were detected as 
"present" in any given RCC PBMC profile. 

[0085] In an initial assessment of the expression data in baseline PBMCs, pairwise 
correlations were calculated to assess the association between gene expression levels 
measured by HgU95A Affymetrix microarrays and continuous measures of clinical 
outcome. Correlations were run using expression levels from each of 5,424 qualifiers that 
passed the initial criteria. Correlations were run for two clinical measures (TTD and TTP) 
and for one measure of baseline expression level (log2-transformed scaled frequency in 
units of ppm). 
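The initial criteria (at least one present call and at least one frequency above 10 ppm across the data set) and the log2 transformation of scaled frequencies can be sketched in Python as follows. The function names are hypothetical, and representing a present call by the string "P" is an assumption about how detection calls are encoded.

```python
import math


def passes_initial_criteria(calls, freqs_ppm, min_ppm=10.0):
    """True if a qualifier has >= 1 present call and >= 1 frequency above min_ppm.

    calls     -- detection call per sample; "P" is assumed to mean present
    freqs_ppm -- scaled frequency per sample, in ppm
    """
    has_present = any(call == "P" for call in calls)
    has_signal = any(freq > min_ppm for freq in freqs_ppm)
    return has_present and has_signal


def log2_frequency(freq_ppm):
    """log2-transformed scaled frequency; freq_ppm must be positive."""
    return math.log2(freq_ppm)
```

Applying the filter before computing correlations reduces the number of tests to the qualifiers that carry a detectable signal in at least one sample.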

[0086] In one example, Spearman's rank correlations were computed. The p-value 
for the hypothesis that the correlation was equal to 0 was calculated for each pairwise 
correlation. For each comparison between clinical outcome and gene expression, the 
number of tests that were nominally significant out of the 5,424 tests performed was 
calculated for five Type I (i.e. false-positive) error levels. To adjust for the fact that 5,424 
non-independent tests were performed, a permutation-based approach was employed to 
evaluate how often the observed number of significance tests would be found under the null 
hypothesis of no correlation. 
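The permutation-based adjustment can be sketched in Python as follows. The function name and the caller-supplied `count_significant` callback are hypothetical. The callback is assumed to return, for a given ordering of the clinical outcome, the number of genes whose Spearman correlation is nominally significant at the chosen error level. Shuffling the outcome values while leaving the expression data intact preserves the dependence among genes, which is why this approach suits non-independent tests.

```python
import random


def exceedance_fraction(observed_count, count_significant, outcome,
                        n_perm=1000, seed=0):
    """Fraction of permutations whose significant-test count >= observed_count.

    observed_count    -- number of nominally significant tests on the real data
    count_significant -- callable taking a permuted outcome list, returning a count
    outcome           -- clinical outcome per patient (e.g., TTP or TTD)
    """
    rng = random.Random(seed)
    shuffled = list(outcome)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any gene-outcome association
        if count_significant(shuffled) >= observed_count:
            hits += 1
    return hits / n_perm
```

A result of 0.053 with 1000 permutations, for example, would be reported as 5.3% (53/1000).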

[0087] The overall results for Spearman's rank correlation comparisons of clinical 
outcome with baseline expression levels (log2-transformed scaled frequency) are 
summarized in Tables 2a and 2b. Each table shows alpha confidence levels ("α"), the 
observed numbers of transcripts that have nominally significant Spearman correlations with 
the clinical outcome of interest ("Observed Number"), and the percentage of permutations 
for which number of nominally significant Spearman correlations equals or exceeds the 
number observed ("%-age of Permutations"). Evidence for association between clinical 
outcome and baseline gene expression in PBMCs was significant for both TTD and TTP. 



Table 2a. Spearman Correlations of Clinical Outcome with Baseline Expression Levels in 
PBMCs of RCC Patients in CCI-779 Therapy (n = 45 patients) 

Time to Disease Progression

α        Observed Number of       %-age of Permutations for which Number of
         Nominally Significant    Nominally Significant Spearman Correlations
         Spearman Correlations*   equals or exceeds observed number
-----    ----------------------   -------------------------------------------
0.1      1127                     5.3% (53/1000)
0.05     749                      3.8% (38/1000)
0.01     248                      3.1% (31/1000)
0.005    159                      2.6% (26/1000)
0.001    51                       2.5% (25/1000)

* based on 5,424 genes (filtered by at least one Present and at least one frequency > 10 
ppm) 

Table 2b. Spearman Correlations of Clinical Outcome with Baseline Expression Levels in 
PBMCs of RCC Patients in CCI-779 Therapy (n = 45 patients) 

Time to Death

α        Observed Number of       %-age of Permutations for which Number of
         Nominally Significant    Nominally Significant Spearman Correlations
         Spearman Correlations*   equals or exceeds observed number
-----    ----------------------   -------------------------------------------
0.1      1604                     0.1% (1/1000)
0.05     1117                     0.1% (1/1000)
0.01     436                      0.1% (1/1000)
0.005    289                      0.1% (1/1000)
0.001    105                      0.3% (3/1000)

* based on 5,424 genes (filtered by at least one Present and at least one frequency > 10 
ppm) 



[0088] Table 3 lists the results of the Spearman's rank correlation analyses for all of 
the 5,424 genes that met the initial criteria. Each gene has a corresponding qualifier on the 
HgU95A genechip, and each qualifier represents multiple oligonucleotide probes that are 
stably attached to discrete regions on the HgU95A genechip. According to the design, RNA 
transcripts of a gene, or the complements thereof, are expected to hybridize under nucleic 
acid array hybridization conditions to the corresponding qualifier on the HgU95A genechip. 
As used herein, a polynucleotide can hybridize to a qualifier if the polynucleotide, or the 
complement thereof, can hybridize to at least one oligonucleotide probe of the qualifier. In 
many embodiments, the polynucleotide or the complement thereof can hybridize to at least 
50%, 60%, 70%, 80%, 90% or 100% of all of the oligonucleotide probes of the qualifier. 
[0089] Each gene or qualifier in Table 3 may have a corresponding SEQ ID NO or 
Entrez accession number from which the oligonucleotide probes of the qualifier can be 
derived. In many instances, a polynucleotide capable of hybridizing to a qualifier can also
hybridize to the sequence of the corresponding SEQ ID NO or Entrez accession number, or
the complement thereof. The sequence of each Entrez accession number can be obtained
from the Entrez nucleotide database at the National Center for Biotechnology Information
(NCBI). The Entrez nucleotide database collects sequences from several sources, including 
GenBank, RefSeq, and PDB. Each SEQ ID NO may be derived from the sequence of the 
corresponding Entrez accession number. Table 4 shows the Entrez and Unigene accession 
numbers for all of the qualifiers on the HgU95A genechip that met the initial criteria. 
[0090] Any ambiguous residue ("n") in a SEQ ID NO can be determined by a variety
of methods. In one embodiment, the ambiguous residues in a SEQ ID NO are determined 
by aligning the SEQ ID NO to a corresponding genomic sequence obtained from a human 
genome sequence database. In another embodiment, the ambiguous residues in a SEQ ID 
NO are determined based on the sequence of the corresponding Entrez accession number. 
In yet another embodiment, the ambiguous residues are determined by re-sequencing the 
SEQ ID NO. 

[0091] Genes associated with each qualifier on the HgU95A genechip can be 
identified based on the annotations provided by Affymetrix. All of the genes thus identified 
are listed in Tables 3 and 5. These genes can also be identified based on their 
corresponding Entrez or Unigene accession numbers. In addition, these genes can be 
determined by BLAST searching their corresponding SEQ ID NOs, or the unambiguous 
segments thereof, against a human genome sequence database. Suitable human genome 
sequence databases for this purpose include, but are not limited to, the NCBI human 
genome database. The NCBI provides BLAST programs, such as "blastn," for searching its 
sequence databases. 

[0092] In one embodiment, the BLAST search of the NCBI human genome database 
is carried out by using an unambiguous segment (e.g., the longest unambiguous segment) of 
a SEQ ID NO. Any gene that aligns to the unambiguous segment with significant sequence
identity can thereby be identified. In many cases, the identified gene has at least 95%, 96%, 97%,
98%, 99%, or more sequence identity with the unambiguous segment. 
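
As a minimal sketch of the query preparation described above, the following code extracts the longest unambiguous segment of a sequence and computes percent identity for an ungapped, equal-length alignment. It does not call BLAST; the example sequences are hypothetical, and the identity calculation is a simplification of what an alignment program reports.

```python
import re


def longest_unambiguous_segment(sequence):
    """Return the longest run of A/C/G/T, i.e. the longest stretch without 'n'."""
    segments = re.findall(r"[ACGT]+", sequence.upper())
    return max(segments, key=len) if segments else ""


def percent_identity(query, subject):
    """Percent of matching positions in an ungapped, equal-length alignment."""
    if len(query) != len(subject) or not query:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(query.upper(), subject.upper()) if a == b)
    return 100.0 * matches / len(query)


if __name__ == "__main__":
    # Hypothetical sequence containing ambiguous residues ("n").
    seq_id_no = "ACGTnnGGCATTACGGATCCnACGT"
    segment = longest_unambiguous_segment(seq_id_no)
    # Hypothetical genomic region already aligned to the segment.
    genomic_hit = "GGCATTACGGATCG"
    print(segment, percent_identity(segment, genomic_hit) >= 95.0)
```

The segment returned by `longest_unambiguous_segment` would be the query submitted to a program such as "blastn"; the 95% cutoff in the usage example mirrors the lowest identity level named in the text.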
[0093] On the basis of Spearman's rank correlation, prognosis genes that are highly 
correlated with TTP or TTD were identified. Table 6a lists examples of genes whose 
expression levels are positively correlated with TTP. Table 6b depicts examples of genes 
whose expression levels are negatively correlated with TTP. Table 6c provides examples of 
genes whose expression levels are positively correlated with TTD. Table 6d shows 
examples of genes whose expression levels are negatively correlated with TTD. Correlation 
coefficients, p-values, and the corresponding qualifiers are also indicated for each gene in 
Tables 6a, 6b, 6c, and 6d. 

Table 6a. Prognosis Genes Positively Correlated with TTP

HgU95A Qualifier  Correlation Coefficient  P-Value  Gene Name
38518_at          0.6019                   0.0000   SCML2
37343_at          0.5932                   0.0000   ITPR3
41174_at          0.5925                   0.0000   RANBP2L1
41669_at          0.5908                   0.0000   KIAA0191
40584_at          0.5602                   0.0001   NUP88
41767_r_at        0.5591                   0.0001   KIAA0855
38256_s_at        0.5551                   0.0001   DKFZP564O092
39829_at          0.5508                   0.0001   ARL7
35802_at          0.5475                   0.0001   KIAA1014
32169_at          0.5407                   0.0001   KIAA0875
41562_at          0.5272                   0.0002   BMI1
35753_at          0.5226                   0.0002   PRP8
40905_s_at        0.5223                   0.0002   DKFZP566J153
41547_at          0.5189                   0.0003   BUB3
37416_at          0.5177                   0.0003   ARHH
37585_at          0.5157                   0.0003   SNRPA1
34716_at          0.5143                   0.0003   TASR
32183_at          0.5034                   0.0004   SFRS11
39426 z at        0.4977                   0.0005   CA150
35815_at          0.4975                   0.0005   HYPB
36403_s_at        0.4972                   0.0005   UNK_AI434146
40828_at          0.4963                   0.0005   P85SPR
35364_at          0.4947                   0.0006   APPBP1
33861_at          0.4931                   0.0006   UNK_AI123426
36474_at          0.4927                   0.0006   KIAA0776
35764_at          0.4908                   0.0006   CXORF5
39129_at          0.4904                   0.0006   UNK_AF052134
32508_at          0.4893                   0.0006   KIAA1096
35842_at          0.4862                   0.0007   UNK_AL049265
41737_at          0.4862                   0.0007   SRM160
36303_f_at        0.4833                   0.0008   ZNF85
34256_at          0.4829                   0.0008   SIAT9
33845_at          0.4828                   0.0008   HNRPH1
40048_at          0.4822                   0.0008   UNK_D43951
[illegible]       0.4801                   0.0008   TRF4
[illegible]       0.4779                   0.0009   UNFIT AAR874R0
2000_at           0.4777                   0.0009   ATM
37078_at          0.4760                   0.0010   CD3Z
38778_at          0.4744                   0.0010   KIAA1046



Table 6b. Prognosis Genes Negatively Correlated with TTP

HgU95A Qualifier  Correlation Coefficient  P-Value  Gene Name
935_at            -0.6319                  0.0000   CAP
34498_at          -0.5385                  0.0001   VNN2
37023_at          -0.5292                  0.0002   LCP1
286_at            -0.5189                  0.0003   H2AFO
38831_f_at        -0.5152                  0.0003   UNK_AF053356
268_at            -0.5126                  0.0003   PECAM1
38893_at          -0.5006                  0.0005   NCF4
34319_at          -0.4950                  0.0005   S100P
37328_at          -0.4931                  0.0006   PLEK
181_g_at          -0.4925                  0.0006   UNK_S82470
38894_g_at        -0.4852                  0.0007   NCF4
32736_at          -0.4805                  0.0008   UNK_W68830



Table 6c. Prognosis Genes Positively Correlated with TTD

HgU95A Qualifier  Correlation Coefficient  P-Value  Gene Name
37385_at          0.6524                   0.0000   CYP
41606_at          0.6155                   0.0000   DRG1
33420_g_at        0.6043                   0.0000   API5
35353_at          0.5969                   0.0000   PSMC2
38017_at          0.5942                   0.0000   CD79A
31851_at          0.5854                   0.0000   RFP2
35319_at          0.5817                   0.0000   CTCF
38702_at          0.5702                   0.0000   UNK_AF070640
36474_at          0.5654                   0.0001   KIAA0776
34256_at          0.5649                   0.0001   SIAT9
34763_at          0.5575                   0.0001   CSPG6
33831_at          0.5561                   0.0001   CREBBP
229_at            0.5499                   0.0001   CBF2
37381_g_at        0.5478                   0.0001   [illegible]
40092_at          0.5436                   0.0001   BAZ2A
39746_at          0.5428                   0.0001   POLR2B
41174_at          0.5424                   0.0001   RANBP2L1
32508_at          0.5397                   0.0001   KIAA1096
33403_at          0.5390                   0.0001   [illegible]
39809_at          0.5381                   0.0001   HBP1
34829_at          0.5373                   0.0001   DKC1
37625_at          0.5350                   0.0002   IRF4
35656_at          0.5336                   0.0002   RNF6
39509_at          0.5328                   0.0002   UNK_AI692348
33543_s_at        0.5324                   0.0002   PNN
38082_at          0.5318                   0.0002   KIAA0650
36303_f_at        0.5311                   0.0002   ZNF85
1885_at           0.5300                   0.0002   ERCC3
32194_at          0.5285                   0.0002   CBF2
41621_i_at        0.5264                   0.0002   ZNF266
33151_s_at        0.5239                   0.0002   UNK_W25932
32169_at          0.5212                   0.0002   KIAA0875
36845_at          0.5203                   0.0002   KIAA0136
36231_at          0.5197                   0.0003   UNK_AC002073
35163_at          0.5172                   0.0003   KIAA1041
40905_s_at        0.5170                   0.0003   DKFZP566J153
39431_at          0.5164                   0.0003   [illegible]
41669_at          0.5160                   0.0003   KIAA0191
35294_at          0.5150                   0.0003   [illegible]
39401_at          0.5139                   0.0003   [illegible]
34716_at          0.5137                   0.0003   TASR
40563_at          0.5136                   0.0003   [illegible]
38667_at          0.5124                   0.0003   [illegible]
38122_at          0.5107                   0.0003   [illegible]
37585_at          0.5096                   0.0004   SNRPA1
32183_at          0.5079                   0.0004   SFRS11
40816_at          0.5074                   0.0004   PWP1


33818_at          0.5055                   0.0004   UNK_AC004472
37703_at          0.5042                   0.0004   [illegible]
38016_at          0.5039                   0.0004   [illegible]
37737_at          0.4997                   0.0005   PCM1 1
36872_at          0.4976                   0.0005   ARPP-19
39415_at          0.4975                   0.0005   [illegible]
40252_g_at        0.4970                   0.0005   HRB2
39727_at          0.4966                   0.0005   DUSP11
1728_at           0.4966                   0.0005   BMI1
34967_at          0.4956                   0.0005   UNK_AF00154y
39864_at          0.4949                   0.0005   CIRBP
32758_g_at        0.4947                   0.0006   RAE1
35753_at          0.4943                   0.0006   PRP8
1857_at           0.4916                   0.0006   MADH7
35764_at          0.4915                   0.0006   CXORF5
32372_at          0.4911                   0.0006   CTSB
33485_at          0.4892                   0.0006   RPL4
34647_at          0.4887                   0.0007   DDX5
1442_at           0.4886                   0.0007   ESR2
41506_at          0.4875                   0.0007   MAPKAPK5
34879_at          0.4873                   0.0007   DPM1
39512_s_at        0.4869                   0.0007   UNK_AA457U2y
36783_f_at        0.4865                   0.0007   H-PLK
35479_at          0.4860                   0.0007   ADAM28
40308_at          0.4858                   0.0007   UNK Alo3049o
38462_at          0.4852                   0.0007   [illegible]
781_at            0.4851                   0.0007   [illegible]
38102_at          0.4850                   0.0007   [illegible]
38256_s_at        0.4829                   0.0008   DKFZP564O092
32850_at          0.4817                   0.0008   [illegible]
35286_r_at        0.4815                   0.0008   [illegible]
36456_at          0.4815                   0.0008   DKFZP564I052
[illegible]       [illegible]              0.0008   SSH3BP1
35805_at          0.4809                   0.0008   DKFZP434D156
40086_at          0.4805                   0.0008   KIAA0261


342 /4_at 


U.'rOl/l 


0 0008 


K1AA1116 

lvlili * A X X v 


jyoy /_at 


U.*T /7J 


0.0009 


DDX16 


4 1 OuD_at 


fi A7Q7 


0 0009 


KTAA0824 


3ol 14_at 


n air^ 

U.*f /OJ 


0 0009 


RAD21 


41lDO_at 


n A7R9 

V.H / OX 


0 0009 

\) m\)\)\l S 


1GHM 


415o9_at 


n /I781 

U.4 fOi 


0 0009 


KTAA0974 


3344U_at 


U.4 / Ih 


0 0009 


TCF8 


36459_at 


U.4/0/ 


0 0009 


KTAA0879 


21o_at 


U.4 /OD 






/iti Art ~ _x 

41199_s_at 


U.4 /OU 


n nnoQ 




4005 l_at 


U.4 /DO 


n nm n 

U.UU1U 


*rtaaoo^7 

.N_L/TurVUU.? / 


3801 9_at 


U.4/D4 


n nni n 




s r\r\ _ a. 

36690_at 


U.4 /40 


n nmn 

U.UU1U 




A < £ A *7 ±. 

41547_at 


U.4 /42 


n nni n 


DUJDJ 


38l05_at 


U.4/34 


a nni n 

U.UUI u 


TTKTXf W76591 


40828_at          0.4732                   0.0010   P85SPR
41809_at          0.4729                   0.0010   UNK_AI656421
36210_g_at        0.4727                   0.0010   FSRG1



Table 6d. Prognosis Genes Negatively Correlated with TTD

HgU95A Qualifier  Correlation Coefficient  P-Value  Gene Name
286_at            -0.5871                  0.0000   H2AFO
32609_at          -0.5841                  0.0000   H2AFO
38483_at          -0.5464                  0.0001   HSA011916
769_s_at          -0.5036                  0.0004   ANXA2
1131_at           -0.4876                  0.0007   MAP2K2
32378_at          -0.4818                  0.0008   PKM2
956_at            -0.4770                  0.0009   TUBB
37311_at          -0.4760                  0.0010   TALDO1
37148_at          -0.4744                  0.0010   LILRB3
36199_at          -0.4725                  0.0010   DAP



[0094] In addition to the specific genes described herein, the present invention 
contemplates the use of any other gene that can hybridize under stringent or nucleic acid 
array hybridization conditions to a qualifier identified in the present invention. These genes 
may include hypothetical or putative genes that are supported by EST or mRNA data. The
expression profiles of these genes may correlate with patient clinical outcome. As used 
herein, a gene can hybridize to a qualifier if an RNA transcript of the gene can hybridize to 
at least one oligonucleotide probe of the qualifier. In many cases, an RNA transcript of the 
gene can hybridize to at least 50%, 60%, 70%, 80%, 90%, or more oligonucleotide probes 
of the qualifier. 

[0095] The oligonucleotide probe sequences of each qualifier on HgU95A genechips 
may be obtained from Affymetrix or from the sequence files maintained at the Affymetrix
website "www.affymetrix.com/support/tec hgu95sequence." 
For instance, the oligonucleotide probe sequences can be found in the sequence file 
"HG_U95A Probe Sequences, FASTA" at the website. This sequence file is incorporated 
herein by reference in its entirety. 
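
A probe sequence file of this kind could be grouped by qualifier with a short script such as the one below. This is a sketch only: it assumes a FASTA header layout of the form `>probe:HG_U95A:<qualifier>:<x>:<y>; ...`, which should be checked against the actual file, and the probe sequences in the usage example are invented for illustration.

```python
import io
from collections import defaultdict


def read_probes_by_qualifier(handle):
    """Group FASTA probe sequences by qualifier.

    Assumes the qualifier is the third colon-separated field of the header
    text that precedes the first semicolon (an assumed layout).
    """
    probes = defaultdict(list)
    qualifier = None
    for raw_line in handle:
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split(";")[0].split(":")
            qualifier = fields[2] if len(fields) > 2 else fields[0]
            probes[qualifier].append("")  # start a new probe record
        elif qualifier is not None:
            probes[qualifier][-1] += line  # sequences may span several lines
    return dict(probes)


if __name__ == "__main__":
    # Hypothetical file content; real probe sequences come from Affymetrix.
    example = io.StringIO(
        ">probe:HG_U95A:1000_at:399:559; Interrogation_Position=1367; Antisense;\n"
        "TCTCCTTTGCTGAGGCCTCCAGCTT\n"
        ">probe:HG_U95A:1000_at:544:185; Interrogation_Position=1379; Antisense;\n"
        "AGGCCTCCAGCTTCAGGCAGGCCAA\n"
    )
    for name, sequences in read_probes_by_qualifier(example).items():
        print(name, len(sequences))
```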

[0096] In another example, a Cox proportional hazard regression model was
employed to assess the correlation between baseline PBMC gene expression levels and
clinical outcome. The Cox model can take into account the effects of censoring on correlations
of gene expression with TTD (or survival as of the last known date alive) and TTP (or
progression-free status as of the last known date alive). Of the 45 RCC patients with baseline
PBMC expression levels, 4 had censored data for TTP and 15 had censored data for TTD.
Similar to the Spearman's assessment of the data, Cox regression can identify genes
significantly correlated with survival and disease progression for any given α-confidence
level. A similar permutation strategy can be used to affirm any correlation between baseline
expression profiles and clinical outcome.

[0097] In one embodiment, models were fit using expression levels from each of the 
5,424 qualifiers that passed the initial filtering criteria in the 45 baseline samples. TTP and 
TTD were tested for their association with log2-transformed scaled frequency at baseline. 
A SAS program was used to generate the estimates in Tables 7a and 7b. Tables 7a and 7b 
demonstrate a strong correlation between TTP/TTD and baseline gene expression. 
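
The per-gene model fit can be illustrated with a small single-covariate Cox regression. The sketch below is not the SAS program referred to above: it assumes the Breslow form of the partial likelihood, a Newton-Raphson fit with a clamped step, and a Wald test for the p-value, and the patient data in the usage example are invented.

```python
import math


def cox_single_covariate(times, events, covariate, max_iter=100, tol=1e-9):
    """Fit a one-covariate Cox model; return (beta, standard_error).

    events[i] is 1 for an observed event and 0 for a censored time.
    """
    n = len(times)
    beta, info = 0.0, 0.0
    for _ in range(max_iter):
        score, info = 0.0, 0.0
        for i in range(n):
            if not events[i]:
                continue  # censored subjects only contribute to risk sets
            s0 = s1 = s2 = 0.0
            for j in range(n):
                if times[j] >= times[i]:  # subject j is still at risk
                    w = math.exp(beta * covariate[j])
                    s0 += w
                    s1 += w * covariate[j]
                    s2 += w * covariate[j] ** 2
            mean = s1 / s0
            score += covariate[i] - mean
            info += s2 / s0 - mean ** 2
        if info <= 0:
            break
        step = max(-1.0, min(1.0, score / info))  # clamp for stability
        beta += step
        if abs(step) < tol:
            break
    se = 1.0 / math.sqrt(info) if info > 0 else float("inf")
    return beta, se


def wald_p_value(beta, se):
    """Two-sided p-value for the null hypothesis beta = 0 (hazard ratio = 1)."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


if __name__ == "__main__":
    # Invented data: scaled frequency (ppm), time in weeks, event indicator.
    frequency = [10, 20, 40, 80, 160, 320, 640, 1280]
    times = [30, 25, 28, 15, 18, 8, 10, 5]
    events = [0, 1, 1, 1, 0, 1, 1, 1]
    log2_frequency = [math.log2(f) for f in frequency]
    beta, se = cox_single_covariate(times, events, log2_frequency)
    print("hazard ratio per doubling:", math.exp(beta))
    print("Wald p-value:", wald_p_value(beta, se))
```

Because the covariate is log2-transformed, `exp(beta)` is read here as the hazard ratio associated with a two-fold increase in baseline expression.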



Table 7a. Cox Regressions of Clinical Outcome on Baseline Expression Levels in PBMCs
of RCC Patients in CCI-779 Therapy (n = 45 patients)

Time to Progression

α        Observed Number of Nominally     Percentage of Permutations for which Number of
         Significant Cox Regressions*     Nominally Significant Cox Regressions Equals or
                                          Exceeds Observed Number**
0.1      1439                             0.8% (4/500)
0.05     950                              0.8% (3/500)
0.01     342                              0.8% (4/500)
0.005    217                              0.8% (4/500)
0.001    53                               1.0% (5/500)

* for 5,424 genes (filtered by at least one Present call and at least one frequency ≥ 10 ppm)
** based on 500 random permutations

Table 7b. Cox Regressions of Clinical Outcome on Baseline Expression Levels in PBMCs
of RCC Patients in CCI-779 Therapy (n = 45 patients)

Time to Death

α        Observed Number of Nominally     Percentage of Permutations for which Number of
         Significant Cox Regressions*     Nominally Significant Cox Regressions Equals or
                                          Exceeds Observed Number**
0.1      1948                             <0.2% (0/500)
0.05     1383                             <0.2% (0/500)
0.01     602                              <0.2% (0/500)
0.005    404                              <0.2% (0/500)
0.001    142                              <0.2% (0/500)

* for 5,424 genes (filtered by at least one Present call and at least one frequency ≥ 10 ppm)
** based on 500 random permutations

[0098] Table 8 lists the results of Cox proportional hazard modeling for all of the
5,424 genes that met the initial criteria. Hazard ratios and p-values (for the null hypothesis
that the hazard ratio is equal to 1, i.e., no change in risk) are indicated for each gene. Examples of
genes that are indicative of high risk for TTP or TTD are shown in Tables 9a or 9c,
respectively. These genes have hazard ratios of at least 3. Examples of genes that are
indicative of low risk for TTP or TTD are described in Tables 9b or 9d, respectively. These
genes have hazard ratios of no more than 0.333.
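
The two cutoffs stated above (at least 3, and no more than 0.333, which is approximately the reciprocal of 3) can be applied as in the following sketch. The first four entries use hazard ratios for TTP taken from Tables 9a and 9b; the last entry is a hypothetical gene added to show the unclassified case, and the function and label names are illustrative.

```python
HIGH_RISK_MIN = 3.0    # hazard ratio of at least 3
LOW_RISK_MAX = 0.333   # hazard ratio of no more than 0.333


def classify_hazard_ratio(hazard_ratio):
    """Label a gene by the hazard-ratio cutoffs stated in the text."""
    if hazard_ratio >= HIGH_RISK_MIN:
        return "high risk"
    if hazard_ratio <= LOW_RISK_MAX:
        return "low risk"
    return "unclassified"


if __name__ == "__main__":
    ttp_hazard_ratios = {
        "LCP1": 6.1066,      # Table 9a
        "CAP": 5.8829,       # Table 9a
        "DDX5": 0.1831,      # Table 9b
        "GNE": 0.2094,       # Table 9b
        "EXAMPLE1": 1.2000,  # hypothetical gene, not from the tables
    }
    for gene, ratio in ttp_hazard_ratios.items():
        print(gene, ratio, classify_hazard_ratio(ratio))
```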



Table 9a. Prognosis Genes Indicative of High Risk for TTP

HgU95A Qualifier  Hazard Ratio  P-Value  Gene Name
37023_at          6.1066        0.0001   LCP1
935_at            5.8829        0.0000   CAP
40771_at          4.9503        0.0586   MSN
37298_at          4.6595        0.0046   GABARAP
31820_at          4.2099        0.0061   [illegible]
676_at            4.1051        0.0016   IFITM1
33906_at          3.9750        0.0106   SSSPA 1
32736_at          3.8093        0.0013   UNK_W68830
40169_at          3.5692        0.0243   TIP47
39811_at          3.4197        0.1074   [illegible]
1309_at           3.3680        0.0053   [illegible]
39814_s_at        3.2703        0.0029   [illegible]
38605_at          3.1625        0.0592   NDUFB1
38831_f_at        3.0853        0.0092   UNK_AF053356



Table 9b. Prognosis Genes Indicative of Low Risk for TTP 










HgU95A Qualifier  Hazard Ratio  P-Value  Gene Name


oy4iD_at 


A AQ1 Q 

U.Uolo 


A AAAO 


xljNKrK 




n i <fiQ 
U.iOUo 


n nnni 
U.UUU1 


DDDO 




U.IOjU 


n neon 


P"DT A 

JrriA 




U.IOJ / 


U«UUZ*t 


niNivrri 1 


36186_at          0.1661        [illegible]  [illegible]
1420_at           0.1662        0.0009   [illegible]
31950_at          0.1724        0.0071   PABPC1
34647_at          0.1831        0.0010   DDX5
36515_at          0.2094        0.0002   GNE
36111_s_at        0.2147        0.0031   SFRS2
39180_at          0.2154        0.0009   FUS
32758_g_at        0.2186        0.0010   RAE1
31952_at          0.2211        0.0076   RPL6
38527_at          0.2258        0.0016   NONO
32831_at          0.2298        0.0006   TIM17
37609_at          0.2321        0.0016   NUBP1
34695_at          0.2330        0.0035   GA17
39730_at          0.2331        0.0005   ABL1
35808_at          0.2385        0.0037   SFRS6
32751_at          0.2386        0.0013   UNK_AF007140
41737_at          0.2393        0.0023   SRM160
32205_at          0.2431        0.0009   PRKRA
40252_g_at        0.2473        0.0033   HRB2
35325_at          0.2540        0.0030   UNK_AF052113
41292_at          0.2549        0.0014   HNRPH1
32658_at          0.2553        0.0010   UNK_AL031228
33307_at          0.2569        0.0008   UNK_AL02231o
40426_at          0.2587        0.0306   BCL7B
41562_at          0.2595        0.0010   BMI1
34315_at          0.2638        0.0149   AFG3L2
33920_at          0.2665        0.0549   DIAPH1
33706_at          0.2698        0.0114   SART1
35170_at          0.2706        0.0053   MAN2C1
229_at            0.2715        0.0064   CBF2
33485_at          0.2724        0.0169   RPL4
1728_at           0.2736        0.0103   BMI1
38105_at          0.2748        0.0017   UNK_W26521
1361_at           0.2801        0.0059   TERF1
32171_at          0.2831        0.0040   EIF5
36456_at          0.2834        0.0015   DKFZP564I052
838_s_at          0.2841        0.0616   UBE2I
1706_at           0.2852        0.0144   ARAF1
38778_at          0.2882        0.0012   KIAA1046


39378_at 


0.2896 


0.1463 